Diel.Niche Vignette

Author: Brian Gerber

Date: 2023-05-25

Introducion

The Diel.Niche package is used to evaluate how animals use the three fundamental periods of light: twilight (dawn & dusk), daytime, and nighttime. First, install and load the library

# Install package from GitHub via devtools.
# devtools::install_github("diel-project/Diel-Niche-Modeling",ref="main")

# Load the pacakge
  library(Diel.Niche)

The packages specifies many diel hypotheses, which together are defined into five hypothesis sets. The code names for the hypotheses and sets are,

what.hyp()
#> Traditional :  D N CR C  
#> 
#>  General :  D N CR C2 D.CR D.N CR.N 
#>  
#>  Threshold :  D.th N.th CR.th C.th 
#>  
#>  Maximizing:  D.max N.max CR.max 
#>  
#>  Variation:  D.var N.var CR.var C.var 
#> 

A short description of each hypothesis is available by calling,

  what.hyp("CR")
#> [1] "Traditional/General Crepuscular"

To list the hypotheses of a given set,

  hyp.sets("Traditional")
#> [1] "D"  "N"  "CR" "C"

  hyp.sets("General")
#> [1] "D"    "N"    "CR"   "C2"   "D.CR" "D.N"  "CR.N"

The best way to understand each hypothesis set is by plotting hypotheses from a set.

Maximizing

This hypothesis set is useful in evaluating which diel phenotype (Diurnal, Nocturnal, Crepuscular) is used the most. There is no hypothesis associated to cathemerality, which is a combination of multiple diel periods

diel.plot(hyp=hyp.sets("Maximizing"))

Traditional

This hypothesis set uses the traditional combinations of the diel modalities of diurnal, nocturnal, crepuscular, and cathemeral.


diel.plot(hyp=hyp.sets("Traditional"))

If left unspecified, default values are used to define the boundaries among these different hypotheses. The default value for being diurnal, nocturnal, or crepuscular is \(\ge\) 0.80 probability. Cathemerality is defined as no diel period activity has more than 0.80 probability and two or more time periods have\(\ge\) 0.1 probability.

General

This hypothesis set aims to provide more specificity tha the Traditional hypothesis set. It includes the Traditional definitions of diurnal, nocturnal, and crepuscular, as above, but cathemerality is defined more specifically in terms of whether two or three diel periods are used moderately.

diel.plot(hyp=hyp.sets("General"))

Threshold

This hypothesis set is based on defining each diel period in terms of lower thresholds. If the thresholds are not provided, default values are used.

diel.plot(hyp=hyp.sets("Threshold"))

Not that where there is no colored space indicates that these set of probabilities are not considered.

Variation

This hypothesis set is based on defining the ranges (lower, uppper) of possible probability values.

diel.plot(hyp=hyp.sets("Variation"))

We can also view specific hypotheses, as

diel.plot("D",more.points = TRUE)

Simulating Data

To simulate data based on a diel hypothesis, we need define a hypothesis using its coded name. Pick a hypothesis to simulate data from and how many samples.

  set.seed(45451)
  hyp=c("CR")
  sim=sim.diel(hyp=hyp,n.sample=100)
  
  #The probability value used to simulate data
  p=sim$p
  p
#>       [,1]  [,2] [,3]
#> [1,] 0.855 0.025 0.12
  
  #The simulated data
  y=sim$y
  y
#>      y_crep y_day y_night
#> [1,]     89     1      10

Setup hypothesis set and fit models

You can setup your own model set by creating a vector of coded hypothesis names or use the pre-defined model sets.

  hyp.set = c("D","CR") #manual

  hyp.set = hyp.sets("Traditional") #pre-defined

To fit the data to the hypothesis set,

  out = diel.fit(y,hyp.set)
#> Data checks Complete.
#> Calculating Bayes Factors...
#> The most supported model is: 
#>  Crepuscular (Traditional)

#Call the model probabilities for each hypothesis in the set
  out$bf.table
#>    Prior    Posterior
#> D   0.25 0.0000000000
#> N   0.25 0.0000000000
#> CR  0.25 0.9992042477
#> C   0.25 0.0007957523

To examine the available outputs from the fitted model object,

attributes(out)
#> $names
#>  [1] "bf.table"           "post.samp"          "ms.model"          
#>  [4] "ppc"                "ms.ppc"             "post.samp.ms.model"
#>  [7] "y"                  "y.vec"              "gelm.diag"         
#> [10] "ms.gelm.diag"       "bf.list"

?diel.fit
#> starting httpd help server ... done

The function ‘diel.fit’ defaults to providing the model probabilities for each hypothesis set, but not the posterior samples of the parameters for each hypothesis. We can change this as,

  out = diel.fit(y,hyp.set,n.chains=2,post.fit = TRUE)
#> Data checks Complete.
#> Calculating Bayes Factors...
#> Posterior Sampling...
#> The most supported model is: 
#>  Crepuscular (Traditional)

We can look at a convergence criteria for each hypothesis. Note that convergence can be very poor for models that do not meet the inequality constraints well.

  out$gelm.diag
#> $D
#> Potential scale reduction factors:
#> 
#>      Point est. Upper C.I.
#> p1_1      0.999          1
#> p1_2      1.001          1
#> 
#> 
#> $N
#> Potential scale reduction factors:
#> 
#>      Point est. Upper C.I.
#> p1_1        NaN        NaN
#> p1_2          1       1.01
#> 
#> 
#> $CR
#> Potential scale reduction factors:
#> 
#>      Point est. Upper C.I.
#> p1_1      1.003       1.02
#> p1_2      0.999       1.00
#> 
#> 
#> $C
#> Potential scale reduction factors:
#> 
#>      Point est. Upper C.I.
#> p1_1          1       1.00
#> p1_2          1       1.02

We can plot the posterior samples from the most supported model to check convergence/mixing as,

 plot(coda::as.mcmc(out$post.samp.ms.model))

The posterior samples for all hypotheses are available in a list.

 names(out$post.samp)
#> [1] "D"  "N"  "CR" "C"

#For each of these list is a list of chains

length(out$post.samp[[1]])
#> [1] 2

#Here are the means of the posterior samples of all hypotheses for chain 1
lapply(out$post.samp,FUN=function(x){colMeans(x[[1]])})
#> $D
#>   p_crep_1    p_day_1  p_night_1 
#> 0.17646753 0.80201824 0.02151423 
#> 
#> $N
#>  p_crep_1   p_day_1 p_night_1 
#> 0.0000000 0.1061735 0.8938265 
#> 
#> $CR
#>   p_crep_1    p_day_1  p_night_1 
#> 0.87862454 0.01813177 0.10324369 
#> 
#> $C
#>   p_crep_1    p_day_1  p_night_1 
#> 0.78413959 0.03457643 0.18128398

Plotting

Using the packages bayesplot and ggplot2, we can examine our posterior distributions along with the true probabilities values,

library(ggplot2)
library(bayesplot)
#> This is bayesplot version 1.10.0
#> - Online documentation and vignettes at mc-stan.org/bayesplot
#> - bayesplot theme set to bayesplot::theme_default()
#>    * Does _not_ affect other ggplot2 plots
#>    * See ?bayesplot_theme_set for details on theme setting

posteriors=coda::as.mcmc(out$post.samp.ms.model)

plot_title <- ggtitle("Posterior distributions",
                      "with medians and 80% intervals")
mcmc_areas(posteriors, prob = 0.8) + plot_title+ 
  geom_vline(xintercept=p[1], linetype="dashed",color = c("red"), size=1)+
  geom_vline(xintercept=p[2], linetype="dashed",color = c("purple"), size=1)+
  geom_vline(xintercept=p[3], linetype="dashed",color = c("green"), size=1)
#> Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
#> ℹ Please use `linewidth` instead.
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
#> generated.

Another way to examine our hypothesis is to plot the theoretical niche space for the hypothesis along with the posterior samples from the most supported model.

diel.plot(hyp=out$ms.model,more.points = TRUE,
          posteriors=out$post.samp.ms.model)

Or, we can plot the whole hypothesis set

diel.plot(hyp=hyp.set,
          posteriors=out$post.samp.ms.model)

User-specified Hypotheses

You do not need to rely on pre-set definitions of diel hypotheses. If you are interested in categorizing a diel phenotype based on 0.90 probability or higher, you can adjust the hypotheses using the function ‘diel.ineq()’ than pass this object to the plot function or model fitting function.

  diel.setup=diel.ineq(xi.min.dom = c(0.9))
  diel.plot(hyp=hyp.sets("Traditional"),diel.setup=diel.setup)

If you want to separate the hypotheses and reduce the parameter space of cathmerality,

diel.setup=diel.ineq(xi.min.dom = 0.7,separation=0.1)
diel.plot(hyp=hyp.sets("Traditional"),diel.setup=diel.setup)

When considering the General hypothesis set, you can adjust the lower probability value to define crepuscular, diurnal, and nocturnal, as well as adjust the definitions of Cathemeral General vs Diurnal-Nocturnal, Crepuscular-Nocturnal, and Diurnal-Crepuscular.

diel.setup=diel.ineq(xi.min.dom = c(0.9,0.05))
diel.plot(hyp=hyp.sets("General"),diel.setup=diel.setup)

This new hypothesis set can be used to simulate data and fit the same models by passing the object diel.setup as an attribute of the functions sim.diel, diel.fit, and diel.plot.


y=sim.diel(hyp="D.CR",n.sample = 200,diel.setup=diel.setup)$y

out = diel.fit(y,hyp.set=hyp.sets("General"),n.chains=2,post.fit = TRUE,diel.setup=diel.setup)
#> Data checks Complete.
#> Calculating Bayes Factors...
#> Posterior Sampling...
#> The most supported model is: 
#>  Diurnal-Crepuscular (General)

diel.plot(hyp=hyp.sets("General"),posteriors = out$post.samp.ms.model,diel.setup=diel.setup)

The Unconstrained Model

If you are interested in comparing the constrained hypotheses to an unconstrained model, you can do that by adding it into the hypothesis set

hyp.set.new=c(hyp.sets("General"),"Uncon")
out = diel.fit(y,hyp.set=hyp.set.new,n.chains=2,post.fit = FALSE,diel.setup=diel.setup)
#> Data checks Complete.
#> Calculating Bayes Factors...
#> The most supported model is: 
#>  Diurnal-Crepuscular (General)

#Compare the uncosntrained model to the rest of the models of the General hypothesis set
out$bf.table
#>       Prior  Posterior
#> D     0.125 0.00000000
#> N     0.125 0.00000000
#> CR    0.125 0.04536500
#> C2    0.125 0.01508754
#> D.CR  0.125 0.85789045
#> D.N   0.125 0.00000000
#> CR.N  0.125 0.00000000
#> Uncon 0.125 0.08165700

Data availabilty and an unequal prior

The package includes a set of available data that can be accessed as

head(diel.data)
#>           scientificName twilight day night trap_nights nsite  min_date
#> 1          Canis latrans        6  14    20         733   131 6/13/2013
#> 2   Didelphis virginiana       21   2    86         733   131 6/13/2013
#> 3      Mephitis mephitis        5   2    24         733   131 6/13/2013
#> 4 Odocoileus virginianus        3  76    29         733   131 6/13/2013
#> 5          Procyon lotor       42  45   201         733   131 6/13/2013
#> 6   Sciurus carolinensis        0 198     4         733   131 6/13/2013
#>   max_date mean_lat mean_lon season       country   phylum    class
#> 1 7/9/2013 41.87236 -87.8423 Summer United States Chordata Mammalia
#> 2 7/9/2013 41.87236 -87.8423 Summer United States Chordata Mammalia
#> 3 7/9/2013 41.87236 -87.8423 Summer United States Chordata Mammalia
#> 4 7/9/2013 41.87236 -87.8423 Summer United States Chordata Mammalia
#> 5 7/9/2013 41.87236 -87.8423 Summer United States Chordata Mammalia
#> 6 7/9/2013 41.87236 -87.8423 Summer United States Chordata Mammalia
#>             order      family             Project unit_type
#> 1       Carnivora     Canidae UWIN_Chicago_IL_USA     28day
#> 2 Didelphimorphia Didelphidae UWIN_Chicago_IL_USA     28day
#> 3       Carnivora  Mephitidae UWIN_Chicago_IL_USA     28day
#> 4    Artiodactyla    Cervidae UWIN_Chicago_IL_USA     28day
#> 5       Carnivora Procyonidae UWIN_Chicago_IL_USA     28day
#> 6        Rodentia   Sciuridae UWIN_Chicago_IL_USA     28day
#>             Common_name Activity_Literature
#> 1                Coyote          Cathemeral
#> 2      Virginia Opossum           Nocturnal
#> 3         Striped Skunk           Nocturnal
#> 4     White-tailed Deer         Crepuscular
#> 5      Northern Raccoon           Nocturnal
#> 6 Eastern Gray Squirrel             Diurnal
?diel.data

Here, we will use one of the data sets to fit a model with 1) equal weight on each hypotheses and 2) prior weight that is higher on the literature definition.

# Virginia Opossum data from the Urban Wildlife Information Network
dat=diel.data[2,]

dat
#>         scientificName twilight day night trap_nights nsite  min_date max_date
#> 2 Didelphis virginiana       21   2    86         733   131 6/13/2013 7/9/2013
#>   mean_lat mean_lon season       country   phylum    class           order
#> 2 41.87236 -87.8423 Summer United States Chordata Mammalia Didelphimorphia
#>        family             Project unit_type      Common_name
#> 2 Didelphidae UWIN_Chicago_IL_USA     28day Virginia Opossum
#>   Activity_Literature
#> 2           Nocturnal

y=data.frame(twilight=dat$twilight,
             day=dat$day, 
             night=dat$night)
rownames(y)=dat$Common_name

y
#>                  twilight day night
#> Virginia Opossum       21   2    86

# Model Comparison using equal eights
out1 = diel.fit(as.matrix(y),hyp.set=hyp.sets("Traditional"),post.fit = FALSE)
#> Data checks Complete.
#> Calculating Bayes Factors...
#> The most supported model is: 
#>  Nocturnal (Traditional)

# Model Comparison using 0.8 probability weigh on the nocturnal hypothesis
out2 = diel.fit(as.matrix(y),hyp.set=hyp.sets("Traditional"),post.fit = FALSE,
                prior=c(0.2/3,0.8,0.2/3,0.2/3))
#> Data checks Complete.
#> Calculating Bayes Factors...
#> The most supported model is: 
#>  Nocturnal (Traditional)

# Results out1
round(out1$bf.table,digits=2)
#>    Prior Posterior
#> D   0.25      0.00
#> N   0.25      0.91
#> CR  0.25      0.00
#> C   0.25      0.09

# Results out 2
round(out2$bf.table,digits=2)
#>    Prior Posterior
#> D   0.07      0.00
#> N   0.80      0.99
#> CR  0.07      0.00
#> C   0.07      0.01

We see that the model weight on the nocturnal hypotheis increase because of our prior weight.

Let’s consider another case where we simulate 10 data sets with differing sample sizes to see how the prior weight affects the model weights. Specifically, we will put a high weight (0.95) on the Cathemeral Traditional hypothesis while the simulated data is from the Nocturnal hypothesis.

set.seed(451)
n=matrix(seq(10,200, 20))
sim.dat=t(apply(n,1,FUN=function(x){sim.diel(n.sim=1, n.sample=x, hyp = "N")$y}))


prior.func=function(x,prior){out= diel.fit(t(as.matrix(x)),hyp.set=hyp.sets("Traditional"),
                                    post.fit = FALSE, print=FALSE,
                                    prior=prior)
                                    out$bf.table[,2]
}

#Equal priors
results.prior1=apply(sim.dat,1,prior=c(0.25,0.25,0.25,0.25),
                     FUN=prior.func)

# Unequal priors
results.prior2=apply(sim.dat,1,prior=c(0.05/3,0.05/3,0.05/3,0.95),
                     FUN=prior.func)

#par(mfrow=c(2,1))
matplot(t(results.prior1),type="l",lwd=3,ylab="Model Probability",xaxt="n")
axis(1,at=1:length(n),lab=n)
legend("right", legend=rownames(results.prior2),col = 1:4,lwd=3,lty = 1:4)


matplot(t(results.prior2),type="l",lwd=3,ylab="Model Probability",xaxt="n")
axis(1,at=1:length(n),lab=n)
legend("right", legend=rownames(results.prior2),col = 1:4,lwd=3,lty = 1:4)

The differences in the model probabilieis are entirely due to the prior weights on each hypothesis. In the second plot, the likelihood of the data is trying to overwhelm such a highly certain prior.

Plotting Issues in RStudio

Plotting is done using the package plotly. Plotly can have issues with RStudio. If you are using RStudio and no figures are opening then: Tools–>Global Options–>Advanced–>Rendering Engine Choose “Desktop OpenGL{} and then restart RStudio.